79 research outputs found

Wavefront Marching Methods: A Unified Algorithm to Solve Eikonal and Static Hamilton-Jacobi Equations

Author: Alonso-Betanzos Amparo
Cancela Brais
Publication venue: IEEE
Publication date: 01/12/2019

© 2020 IEEE. This version of the article has been accepted for publication, after peer review. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The Version of Record is available online at: https://doi.org/10.1109/TPAMI.2020.2993500

[Abstract] This paper presents a unified propagation method for dealing with both the classic Eikonal equation, where the motion direction does not affect the propagation, and the more general static Hamilton-Jacobi equations, where it does. While classic Fast Marching Method (FMM) techniques solve the Eikonal equation with O(M log M) complexity (or O(M) with some modifications), solving the more general static Hamilton-Jacobi equation requires higher complexity. The proposed framework maintains the O(M log M) complexity for both problems, while achieving higher accuracy than the available state of the art. The key idea behind the proposed method is the creation of ‘mini wave-fronts’, where the solution is interpolated to minimize the discretization error. Experimental results show how our algorithm can outperform the state of the art in both precision and computational cost.

The authors would like to thank the Spanish Ministerio de Economía y Competitividad (research project TIN2015-65069-C2-1-R), the Xunta de Galicia (research projects ED431C 2018/34 and Centro Singular de Investigación de Galicia, accreditation 2016-2019) and the European Union (European Regional Development Fund) for their financial support. Brais Cancela acknowledges the support of the Xunta de Galicia under its postdoctoral program.

Xunta de Galicia; ED431C 2018/3
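As a point of reference for the Eikonal case mentioned in the abstract above, the following is a minimal sketch of the classic first-order Fast Marching Method, not the wavefront marching method the paper proposes. It assumes a 2D grid with unit spacing, the formulation |∇T| · F = 1 with T = 0 at the sources, and a binary heap, which is where the O(M log M) cost comes from. The function name `fast_marching` and its parameters are illustrative.

```python
import heapq
import math


def fast_marching(speed, sources):
    """First-order Fast Marching on a 2D grid with unit spacing.

    speed   -- list of rows of positive floats F(x) (assumed notation)
    sources -- list of (row, col) cells where the arrival time T is 0
    Returns a grid of arrival times T approximating |grad T| * F = 1.
    """
    rows, cols = len(speed), len(speed[0])
    T = [[math.inf] * cols for _ in range(rows)]
    frozen = [[False] * cols for _ in range(rows)]
    heap = []
    for i, j in sources:
        T[i][j] = 0.0
        heapq.heappush(heap, (0.0, i, j))

    def axis_min(i, j, di, dj):
        # Smallest already-frozen neighbour value along one grid axis.
        best = math.inf
        for s in (-1, 1):
            a, b = i + s * di, j + s * dj
            if 0 <= a < rows and 0 <= b < cols and frozen[a][b]:
                best = min(best, T[a][b])
        return best

    def update(i, j):
        # Upwind quadratic update; falls back to a one-sided update
        # when only one axis contributes (or the values differ too much).
        a = axis_min(i, j, 1, 0)
        b = axis_min(i, j, 0, 1)
        h = 1.0 / speed[i][j]
        lo, hi = min(a, b), max(a, b)
        if hi - lo >= h:
            return lo + h
        return 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))

    while heap:
        _, i, j = heapq.heappop(heap)
        if frozen[i][j]:
            continue  # stale heap entry
        frozen[i][j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < rows and 0 <= b < cols and not frozen[a][b]:
                candidate = update(a, b)
                if candidate < T[a][b]:
                    T[a][b] = candidate
                    heapq.heappush(heap, (candidate, a, b))
    return T


if __name__ == "__main__":
    grid = [[1.0] * 5 for _ in range(5)]
    times = fast_marching(grid, [(0, 0)])
    print(times[0][4], times[1][1])
```

With uniform speed the result approximates Euclidean distance from the source; the discretization error along diagonals is the kind of error that the interpolation idea in the abstract is meant to reduce.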

Repositorio da Universidade da Coruña

A scalable saliency-based Feature selection method with instance level information

Author: Alonso-Betanzos Amparo
Bolón-Canedo Verónica
Cancela Brais
Gama João
Publication venue: Elsevier BV
Publication date: 30/04/2019

Classic feature selection techniques remove features that are either irrelevant or redundant, yielding a subset of relevant features that supports better knowledge extraction. This allows the creation of compact models that are easier to interpret. Most of these techniques work over the whole dataset, but they are unable to provide the user with useful information when only instance-level information is needed. In short, given any example, classic feature selection algorithms do not indicate which features are the most relevant for that particular sample. This work aims to overcome this handicap by developing a novel feature selection method, called Saliency-based Feature Selection (SFS), based on deep-learning saliency techniques. Our experimental results show that this algorithm can be successfully used not only in Neural Networks, but also with any architecture trained using Gradient Descent techniques.

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña

Distributed Correlation-Based Feature Selection in Spark

Author: Alonso-Betanzos Amparo
de-Marcos Luis
Palma-Mendoza Raul-Jose
Rodriguez Daniel
Publication venue: Elsevier BV
Publication date: 31/01/2019

CFS (Correlation-Based Feature Selection) is an FS algorithm that has been successfully applied to classification problems in many domains. We describe Distributed CFS (DiCFS) as a completely redesigned, scalable, parallel and distributed version of the CFS algorithm, capable of dealing with the large volumes of data typical of big data applications. Two versions of the algorithm were implemented and compared using the Apache Spark cluster computing model, currently gaining popularity due to its much faster processing times than Hadoop's MapReduce model. We tested our algorithms on four publicly available datasets, each consisting of a large number of instances and two also consisting of a large number of features. The results show that our algorithms were superior in terms of both time-efficiency and scalability. In leveraging a computer cluster, they were able to handle larger datasets than the non-distributed WEKA version while maintaining the quality of the results, i.e., exactly the same features were returned by our algorithms when compared to the original algorithm available in WEKA.

Comment: 25 pages, 5 figure
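For readers unfamiliar with CFS, the sketch below shows the subset merit heuristic usually associated with the original (non-distributed) algorithm: k · r_cf / sqrt(k + k(k−1) · r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation of a k-feature subset. It is a single-machine illustration only and says nothing about how DiCFS distributes the computation in Spark; the function name `cfs_merit` and its argument layout are assumptions made for this example.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset under the CFS heuristic.

    feature_class_corr   -- correlation of each selected feature with the class
    feature_feature_corr -- correlation of each pair of selected features
                            (empty when the subset has a single feature)
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    r_cf = sum(feature_class_corr) / k
    if feature_feature_corr:
        r_ff = sum(feature_feature_corr) / len(feature_feature_corr)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


if __name__ == "__main__":
    # Two equally predictive features: redundant vs. uncorrelated.
    print(cfs_merit([0.6, 0.6], [1.0]))  # fully redundant pair
    print(cfs_merit([0.6, 0.6], [0.0]))  # uncorrelated pair scores higher
```

The heuristic rewards subsets whose features correlate with the class but not with each other, which is why the uncorrelated pair receives the higher merit.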

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña

On developing an automatic threshold applied to feature selection ensembles

Author: Alonso-Betanzos Amparo
Bolón-Canedo Verónica
Seijo Pardo Borja
Publication venue: Elsevier
Publication date: 01/01/2019

© 2019. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/. This version of the article "B. Seijo-Pardo, V. Bolón-Canedo, and A. Alonso-Betanzos, «On developing an automatic threshold applied to feature selection ensembles», Information Fusion, vol. 45, pp. 227-245, Jan. 2019" has been accepted for publication in Information Fusion. The Version of Record is available online at https://doi.org/10.1016/j.inffus.2018.02.007

[Abstract] Feature selection ensemble methods are a recent approach aiming at adding diversity in sets of selected features, improving performance and obtaining more robust and stable results. However, using an ensemble introduces the need for an aggregation step to combine the outputs of all the methods that make up the ensemble. Besides, when trying to improve computational efficiency, ranking methods that order all initial features are preferred, and so an additional thresholding step is also mandatory. In this work two different ensemble designs based on ranking methods are described. The main difference between them is the order in which the combination and thresholding steps are performed. In addition, a new automatic threshold based on the combination of three data complexity measures is proposed and compared with traditional thresholding approaches based on retaining a fixed percentage of features. The behavior of these methods was tested, according to the SVM classification accuracy, with satisfactory results, for three different scenarios: synthetic datasets and two types of real datasets (where sample size is much higher than feature size, and where feature size is much higher than sample size).

This research has been financially supported in part by the Spanish Ministerio de Economía y Competitividad (research project TIN 2015-65069-C2-1-R), by the Xunta de Galicia (research projects GRC2014/035 and the Centro Singular de Investigación de Galicia, accreditation 2016–2019) and by the European Union (FEDER/ERDF).

Xunta de Galicia; GRC2014/03
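The difference between the two ensemble designs described in the abstract above can be illustrated with a small sketch. It assumes mean-rank aggregation, a union of the retained sets, and a fixed-percentage threshold; the abstract does not specify the aggregation rule, and the automatic threshold based on data complexity measures is not reproduced here. The function names are hypothetical.

```python
def combine_then_threshold(rankings, keep_fraction):
    """Aggregate the rankings first (mean position), then cut once."""
    features = rankings[0]
    mean_pos = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    # Ties are broken by feature name to keep the result deterministic.
    ordered = sorted(features, key=lambda f: (mean_pos[f], f))
    k = max(1, round(keep_fraction * len(ordered)))
    return ordered[:k]


def threshold_then_combine(rankings, keep_fraction):
    """Cut every ranking first, then take the union of what survives."""
    k = max(1, round(keep_fraction * len(rankings[0])))
    kept = set()
    for ranking in rankings:
        kept.update(ranking[:k])
    return sorted(kept)


if __name__ == "__main__":
    # Three rankers ordering the same four features, best first.
    rankings = [
        ["a", "b", "c", "d"],
        ["b", "a", "d", "c"],
        ["a", "c", "b", "d"],
    ]
    print(combine_then_threshold(rankings, 0.5))
    print(threshold_then_combine(rankings, 0.5))
```

On this toy input the two orders return different subsets, since thresholding before combining can retain any feature that a single ranker placed near the top.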

Repositorio da Universidade da Coruña

Low-Precision Feature Selection on Microarray Data: An Information Theoretic Approach

Author: Alonso-Betanzos Amparo
Bolón-Canedo Verónica
Morán-Fernández Laura
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

Funded for open access publication: Universidade da Coruña/CISUG

[Abstract] The number of interconnected devices surrounding us every day, such as personal wearables, cars, and smart homes, has recently increased. Internet of Things devices monitor many processes and are able to use machine learning models for pattern recognition, and even for making decisions, with the added advantage of reducing network congestion by allowing computations near the data sources. The main restriction is the low computation capacity of these devices. Thus, machine learning algorithms capable of maintaining accuracy while using mechanisms that exploit certain characteristics, such as low-precision versions, are needed. In this paper, low-precision mutual information-based feature selection algorithms are employed over DNA microarray datasets, showing that 16-bit and sometimes even 8-bit representations of these algorithms can be used without significant variations in the final classification results achieved.

This work has been supported by the grant Machine Learning on the Edge - Ayudas Fundación BBVA a Equipos de Investigación Científica 2019. It has also been possible thanks to the support received by the National Plan for Scientific and Technical Research and Innovation of the Spanish Government (Grant PID2019-109238GB-C2), and by the Xunta de Galicia (Grant ED431C 2018/34) with the European Union ERDF funds. CITIC, as Research Center accredited by Galician University System, is funded by “Consellería de Cultura, Educación e Universidades from Xunta de Galicia”, supported in an 80% through ERDF Funds, ERDF Operational Programme Galicia 2014-2020, and the remaining 20% by “Secretaría Xeral de Universidades” (Grant ED431G 2019/01). Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.

Xunta de Galicia; ED431C 2018/34
Xunta de Galicia; ED431G 2019/0
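To give a feel for what a low-precision mutual information computation might look like, the sketch below estimates the mutual information between two discrete variables and optionally rounds every intermediate value to IEEE 754 half precision (16 bits). This is an assumption made purely for illustration: the abstract does not say which 16-bit or 8-bit representation the authors use, and the names `to_half` and `mutual_information` are hypothetical.

```python
import math
import struct
from collections import Counter


def to_half(value):
    """Round a float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def mutual_information(xs, ys, quantize=None):
    """Mutual information (in nats) between two discrete sequences.

    quantize -- optional function applied to every intermediate result,
                e.g. to_half, to mimic reduced-precision arithmetic.
    """
    q = quantize if quantize is not None else (lambda v: v)
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = q(count / n)
        p_x = q(count_x[x] / n)
        p_y = q(count_y[y] / n)
        ratio = q(p_xy / q(p_x * p_y))
        mi = q(mi + q(p_xy * q(math.log(ratio))))
    return mi


if __name__ == "__main__":
    feature = [0, 1] * 50
    label = list(feature)  # label fully determined by the feature
    print(mutual_information(feature, label))
    print(mutual_information(feature, label, quantize=to_half))
```

Comparing the two printed values on a given dataset gives a rough sense of how much a feature ranking could shift under reduced precision.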

Repositorio da Universidade da Coruña

Metodología de trabajo y experiencias de aprendizaje colaborativo y evaluación continua en la disciplina de Sistemas Multiagente

Author: Alonso-Betanzos Amparo
Bellas Bouza Francisco
Publication venue: Universidad de Zaragoza. Escuela Universitaria Politécnica de Teruel
Publication date: 01/01/2007

This paper describes the experiences carried out to adapt the course Sistemas Expertos (Expert Systems), part of the Computer Engineering degree at the Universidad de A Coruña, to the European Higher Education Area. The new approach focuses mainly on collaborative activities, the incorporation of virtual resources, and the continuous assessment system used. These are feasible largely because the number of students enrolled in the course (an average of 21 over the last two academic years) is suitable for this type of experience. With this approach, 95% of the enrolled students (100% of those who sat the assessment) passed the course and demonstrated a high level of assimilation of the concepts.

Repositorio Institucional de la Universidad de Alicante

How Important Is Data Quality? Best Classifiers vs Best Features

Author: Alonso-Betanzos Amparo
Bolón-Canedo Verónica
Morán-Fernández Laura
Publication venue: Elsevier BV
Publication date: 01/01/2021

Funded for open access publication: Universidade da Coruña/CISUG

[Abstract] Choosing the appropriate classifier for a given scenario is not an easy task. First, there is an increasingly high number of algorithms available, belonging to different families. There is also a lack of methodologies that can help recommend in advance a given family of algorithms for a certain type of dataset. Besides, most of these classification algorithms exhibit a degradation in performance when faced with datasets containing irrelevant and/or redundant features. In this work we analyze the impact of feature selection on classification over several synthetic and real datasets. The experimental results obtained show that the significance of selecting a classifier decreases after applying an appropriate preprocessing step; this not only eases the choice, but also improves the results in almost all the datasets tested.

This work has been supported by the National Plan for Scientific and Technical Research and Innovation of the Spanish Government (Grant PID2019-109238 GB-C2), and by the Xunta de Galicia (Grant ED431C 2018/34) with the European Union ERDF funds. CITIC, as Research Center accredited by Galician University System, is funded by “Consellería de Cultura, Educación e Universidades from Xunta de Galicia”, supported in an 80% through ERDF Funds, ERDF Operational Programme Galicia 2014–2020, and the remaining 20% by “Secretaría Xeral de Universidades” (Grant ED431G 2019/01). Funding for open access charge: Universidade da Coruña/CISUG

Xunta de Galicia; ED431C 2018/34
Xunta de Galicia; ED431G 2019/0

Repositorio da Universidade da Coruña

Una aproximación al Espacio Europeo de Educación Superior basada en el desarrollo de proyectos software en Ingeniería del Conocimiento

Author: Alonso-Betanzos Amparo
Guijarro Berdiñas Bertha
Publication venue: Universidad de Zaragoza. Escuela Universitaria Politécnica de Teruel
Publication date: 01/01/2007

This paper presents a proposal for teaching the course Ingeniería del Conocimiento (Knowledge Engineering) in the Computer Engineering degree. The proposal is an effort to adapt the course to the European Higher Education Area, for which one of the main problems is usually the high number of students in the classroom. The paper explains how we managed this problem in order to carry out the adaptation of the course using project-oriented learning, along with the advantages and drawbacks encountered. In addition, the system used, with which we have generally obtained positive results, can easily be extrapolated to other courses in Computer Engineering curricula, such as those related to Software Engineering.

Repositorio Institucional de la Universidad de Alicante

On the Effectiveness of Convolutional Autoencoders on Image-Based Personalized Recommender Systems

Author: Alonso-Betanzos Amparo
Blanco Eva
Bolón-Canedo Verónica
Remeseiro Beatriz
Publication venue: MDPI AG
Publication date: 13/03/2020

[Abstract] Over the years, the success of recommender systems has become remarkable. Due to the massive number of options within a consumer's reach, a collaborative environment has emerged, where users from all over the world seek and share their opinions on all types of products. Specifically, millions of images tagged with users’ tastes are available on the web. Therefore, the application of deep learning techniques to solve these types of tasks has become a key issue, and there is a growing interest in the use of images to solve them, particularly through feature extraction. This work explores the potential of using only images as sources of information for modeling users’ tastes and proposes a method to provide gastronomic recommendations based on them. To achieve this, we focus on the pre-processing and encoding of the images, proposing the use of a pre-trained convolutional autoencoder as feature extractor. We compare our method with the standard approach of using convolutional neural networks and study the effect of applying transfer learning, showing that in this case it is better to use only the specific knowledge of the target domain, even if fewer examples are available.

This research has been financially supported in part by European Union FEDER funds, by the Spanish Ministerio de Economía y Competitividad (research project PID2019-109238GB), by the Consellería de Industria of the Xunta de Galicia (research project GRC2014/035), and by the Principado de Asturias Regional Government (research project IDI-2018-000176). CITIC as a Research Center of the Galician University System is financed by the Consellería de Educación, Universidades e Formación Profesional (Xunta de Galicia) through the ERDF (80%), Operational Programme ERDF Galicia 2014–2020 and the remaining 20% by the Secretaria Xeral de Universidades (ref. ED431G 2019/01).

Xunta de Galicia; GRC2014/035
Gobierno del Principado de Asturias; IDI-2018-000176
Xunta de Galicia; ED431G 2019/0

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña

Crossref

Regression Tree Based Explanation for Anomaly Detection Algorithm

Author: Alonso-Betanzos Amparo
Eiras-Franco Carlos
López-Riobóo Botana Iñigo Luis
Publication venue: MDPI AG
Publication date: 18/08/2020

[Abstract] This work presents EADMNC (Explainable Anomaly Detection on Mixed Numerical and Categorical spaces), a novel approach to address explanation using an anomaly detection algorithm, ADMNC, which provides accurate detections on mixed numerical and categorical input spaces. Our improved algorithm leverages the formulation of the ADMNC model to offer pre-hoc explainability based on CART (Classification and Regression Trees). The explanation is presented as a segmentation of the input data into homogeneous groups that can be described with a few variables, offering supervisors novel information for justifications. To prove scalability and interpretability, we list experimental results on large real-world datasets, focusing on the network intrusion detection domain.

This research was partially funded by European Union ERDF funds, Ministerio de Ciencia e Innovación grant number PID2019-109238GB-C22, Xunta de Galicia through the accreditation of Centro Singular de Investigación 2016-2020, Ref. ED431G/01 and Grupos de Referencia Competitiva, Ref. GRC2014/035

Xunta de Galicia; ED431G/01
Xunta de Galicia; GRC2014/03

Repositorio da Universidade da Coruña